Make `thriftbp.NewBaseplateClientPool()` behave more like the `grpcbp` one #602

nanassito · 2023-02-16T18:50:20Z

💸 TL;DR

Change the pool initialization so that it retries opening connections until the required number is open, instead of failing on the first error.

📜 Details

Prior to this PR, the thriftbp pool initialization behaved as follows:

Try up to initialConnections times to open a connection, but bail out at the first error.
If fewer than requiredInitialConnections connections were opened, crash.
Otherwise, proceed even though fewer than initialConnections connections are open.

This led to reliability issues for services, as pods sometimes end up starting with very few connections ready. The missing connections are then opened inline with requests, which adds latency and more opportunities for failure.
In my opinion this behavior is also confusing, since we end up returning fewer connections than the user configured. It is explicit in the documentation, but I still found more than a few people surprised by it.

With this PR applied, the new behavior is as follows:

Open up to requiredInitialConnections connections, retrying if necessary until either success or a context error.
Open the rest of the initialConnections using the same behavior as before.

While not my ideal approach, this lets the user easily raise the effectively usable value of requiredInitialConnections while leaving the initialConnections behavior unchanged.

Rollout:

Once merged I will need to work with econ-be to update their calls as they are the only ones directly impacted by the behavior change.
Services will be able to start setting requiredInitialConnections to higher values than they currently do.

🧪 Testing Steps / Validation

✅ Checks

CI tests (if present) are passing
Adheres to code style for repo
Contributor License Agreement (CLA) completed if not a Reddit employee

pacejackson · 2023-02-24T17:13:53Z

clientpool/channel.go

+			i++
+		} else {
+			lastAttemptErr = err
+			log.Warnf("clientpool: error creating required client (will retry): %w", err)


I don't know that I would add a log.Warn here, I would worry about this causing a lot of log spam if there's an issue?

@fishy really wanted a way to make the error clear to users.

You can limit the log spam with a rate limiter:

https://sourcegraph.build.ue1.snooguts.net/github.snooguts.net/reddit-go/httpbp/-/blob/httpbp/middleware/helpers.go?L17:12

(it should be at whatever scope you want to collect together, so global if you want to limit the spam at a per process level)

Added a rate limiter (RL) at 2s.

pacejackson · 2023-02-24T17:22:44Z

thriftbp/client_pool_test.go

+			addrGen: func() thriftbp.AddressGenerator {
+				i := 0
+				return func() (string, error) {
+					i += 1
+					var err error
+					if i == 1 {
+						err = fmt.Errorf("something broke")
+					}
+					return ln.Addr().String(), err
+				}
+			}(),


🔕 Is there any other way for us to fake these errors other than hacking them into the AddressGenerator? This isn't really the kind of error we would expect to get since most of these just return a static string (we added it to be compatible with some old ads code).

Not a blocker, but it might be nice to have the errors be closer to errors that we would actually expect/not rely on something that is really just a backwards compatibility hack.

I don't know; that's how every other test I found does it, so I used the same approach.

clientpool/channel.go

kylelemons

I've been swamped; from a cursory review this looks fine. We might want to discuss always trying at least initialConnections times (i.e. not bailing on the first failure) as well, since I think that is an innocuous behavior change, but we can do that separately.

kylelemons · 2023-02-24T19:53:01Z

clientpool/channel.go

+			i++
+		} else {
+			lastAttemptErr = err
+			log.Warnf("clientpool: error creating required client (will retry): %w", err)


You can limit the log spam with a rate limiter:

https://sourcegraph.build.ue1.snooguts.net/github.snooguts.net/reddit-go/httpbp/-/blob/httpbp/middleware/helpers.go?L17:12

(it should be at whatever scope you want to collect together, so global if you want to limit the spam at a per process level)

Co-authored-by: Andrew Boyle <[email protected]>

nanassito force-pushed the thriftbp.init branch 7 times, most recently from 842275c to 242517c Compare February 16, 2023 23:56

nanassito marked this pull request as ready for review February 23, 2023 22:37

nanassito requested a review from a team as a code owner February 23, 2023 22:37

nanassito requested review from fishy, kylelemons and mterwill and removed request for a team February 23, 2023 22:37

mterwill removed their request for review February 24, 2023 14:57

nanassito requested a review from pacejackson February 24, 2023 16:08

pacejackson approved these changes Feb 24, 2023


kylelemons approved these changes Feb 24, 2023


nanassito force-pushed the thriftbp.init branch from ea19e8b to c0a7191 Compare February 24, 2023 21:33

Dorian Jaminais-Grellier and others added 4 commits February 24, 2023 13:34

Retry the creation of the requiredInitialConnections

d806ec5

add tests

0517897

Update clientpool/channel.go

9b51e7d

Co-authored-by: Andrew Boyle <[email protected]>

Rate limit the warning logs

edad33a

nanassito force-pushed the thriftbp.init branch from c0a7191 to edad33a Compare February 24, 2023 21:34

kylelemons merged commit 6f97656 into reddit:master Feb 27, 2023
